A method for binning time-of-flight data

ABSTRACT

The invention relates to a method for binning TOF data from a scene, for increasing the accuracy of TOF measurements and reducing the noise therein, the TOF data comprising phase data and confidence data, the method comprising the steps of acquiring a plurality of TOF data by illuminating the scene with a plurality of modulated signals; associating each modulated signal with a vector defined by a phase and a confidence data, respectively; adding the plurality of vectors for obtaining a binned vector; determining the phase and confidence of the binned vector; processing the phase and confidence data of the binned vector for obtaining depth data of the scene.

TECHNICAL FIELD OF THE INVENTION

The invention relates to a method for binning Time-Of-Flight data. In particular, the invention relates to a method for performing more accurate Time-Of-Flight measurements while minimizing the noise.

BACKGROUND OF THE INVENTION

Time-Of-Flight technology (TOF) is a promising technology for depth perception. The well-known basic operational principle of a standard TOF camera system 3 is illustrated in FIG. 1. The TOF camera system 3 captures 3D images of a scene 15 by analysing the time of flight of light from a dedicated illumination unit 18 to an object. The TOF camera system 3 includes a camera, for instance a 3D sensor 1, and data processing means 4. The scene 15 is actively illuminated with a modulated light 16 at a predetermined wavelength using the dedicated illumination unit 18, for instance with light pulses of at least one predetermined frequency. The modulated light is reflected back from objects within the scene. A lens 2 collects the reflected light 17 and forms an image of the objects onto the imaging sensor 1 of the camera. Depending on the distance of objects from the camera, a delay is experienced between the emission of the modulated light, e.g. the so-called light pulses, and the reception at the camera of those reflected light pulses. The distance between reflecting objects and the camera may be determined as a function of the observed time delay and the speed of light.

The distance of objects from the camera can be calculated as follows. For clarity purposes, an example of signals is given in FIG. 2. A modulation signal S 16 is sent towards an object. After reflection on the object, a signal S_(φ) 17 is detected by a photodetector. This signal S_(φ) is phase-shifted by a phase φ compared to the original signal S, due to the travelling time. For instance, if the signal S 16 is a sinusoidal wave of the form:

S=A cos(2πft)   (eq. 1)

then, S_(φ) can be seen as a phase-shifted wave with the following mathematical form:

S _(φ) =A cos(2πft+φ)=A cos(2πft)cos(φ)−A sin(2πft)sin(φ).   (eq. 2)

By defining the so-called in-phase I and quadrature Q components by:

I=A cos(φ) and Q=A sin(φ)   (eq. 3, 4)

then S_(φ) can be written as

S _(φ) =I cos(2πft)−Q sin(2πft).   (eq. 5)

This equation enables representing S_(φ) in its polar form, as a vector, represented in FIG. 3, with φ being the phase of S_(φ) and r being a parameter corresponding to the amplitude A of the signal S_(φ) and being also related to the so-called confidence.
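As a short worked step, and assuming A ≥ 0, equations 3 and 4 give the norm and the phase of this vector directly:

$r = \sqrt{I^{2} + Q^{2}} = \sqrt{A^{2}\cos^{2}\phi + A^{2}\sin^{2}\phi} = A, \qquad \tan\phi = \frac{Q}{I}.$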

φ, I and Q are key parameters for measuring the distance of objects from the camera. To measure these parameters, the photodetected signal S_(φ) is usually correlated with electrical reference signals named S_(I), S_(Ī), S_(Q) and S_(Q̄). These reference signals are shifted by 0°, 180°, 90° and 270° respectively, compared to the original optical signal S, as illustrated in FIG. 2. The correlation signals obtained are defined as follows:

S_(φ,I) = S_(φ) · S_(I)

S_(φ,Ī) = S_(φ) · S_(Ī)

S_(φ,Q) = S_(φ) · S_(Q)

S_(φ,Q̄) = S_(φ) · S_(Q̄).   (eq. 6-9)

Then, the two parameters I and Q can be calculated such that:

I = A_(S)·α·(S_(φ,I) − S_(φ,Ī)) and

Q = A_(S)·α·(S_(φ,Q) − S_(φ,Q̄)).   (eq. 10-11)

A_(S) and α are, respectively, the amplitude change of the photodetected signal S_(φ) and the efficiency of the correlation.
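By way of a non-limiting illustration, the differential correlations above can be checked numerically with the short Python sketch below. It rests on assumptions that are not part of the description: the references are ideal unit-amplitude cosines, each correlation is taken as the mean of the sample-wise product over one modulation period, and the scale factor A_(S)·α is set to 1. The names `correlate` and `iq_from_correlations` are hypothetical.

```python
import math


def correlate(phi, amplitude, ref_shift, n=1000):
    """Mean of S_phi * S_ref over one modulation period (idealised correlation).

    S_phi = amplitude * cos(wt + phi) is the received signal and
    S_ref = cos(wt + ref_shift) is an assumed unit-amplitude reference.
    """
    total = 0.0
    for k in range(n):
        wt = 2.0 * math.pi * k / n
        total += amplitude * math.cos(wt + phi) * math.cos(wt + ref_shift)
    return total / n


def iq_from_correlations(phi, amplitude=1.0):
    """Return (I, Q) from the four correlations, with the scale factor set to 1."""
    s_i = correlate(phi, amplitude, 0.0)                    # reference at 0 degrees
    s_i_bar = correlate(phi, amplitude, math.pi)            # reference at 180 degrees
    s_q = correlate(phi, amplitude, math.pi / 2.0)          # reference at 90 degrees
    s_q_bar = correlate(phi, amplitude, 3.0 * math.pi / 2.0)  # reference at 270 degrees
    return s_i - s_i_bar, s_q - s_q_bar
```

Under these assumptions the two differences reduce to A cos(φ) and A sin(φ), in agreement with equations 3 and 4. A real sensor would use its own reference waveforms and scale factors.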

The extraction of φ depends on the shape of the modulation signal S. For example, if S is a sine wave, then

$\phi = \begin{cases} \arctan\frac{Q}{I} & \text{if } I, Q \geq 0 \\ \arctan\frac{Q}{I} + \pi & \text{if } I < 0 \\ \arctan\frac{Q}{I} + 2\pi & \text{if } Q < 0,\ I \geq 0 \end{cases} \qquad \text{(eq. 12-14)}$
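As a minimal sketch, the three cases of equations 12 to 14 can be expressed in Python with the two-argument arctangent, which handles the quadrant selection and the case I = 0. The function name `phase_from_iq` is hypothetical.

```python
import math


def phase_from_iq(i, q):
    """Return the phase in [0, 2*pi) for in-phase i and quadrature q.

    math.atan2 returns a value in [-pi, pi]; the modulo shifts negative
    results up by 2*pi, which matches the three cases of eq. 12-14
    (up to floating-point rounding at the wrap-around point).
    """
    return math.atan2(q, i) % (2.0 * math.pi)
```

For example, I = −1 and Q = 1 fall in the second case and give arctan(−1) + π = 3π/4.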

Once the phase is known, the distance D_(φ) of objects from camera can be retrieved thanks to the following formula:

$D_{\phi} = \frac{c \cdot (\phi + 2\pi \cdot n)}{4\pi \cdot f_{mod}} \qquad \text{(eq. 15)}$

where f_(mod) is the modulation frequency and n is an integer number.
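Equation 15 can be sketched in Python as follows. The function names are hypothetical, and the helper `unambiguous_range` is an addition for illustration: it is the distance obtained from equation 15 for φ = 2π and n = 0, i.e. the distance step associated with each increment of n.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # speed of light in vacuum, in m/s


def phase_to_distance(phi, f_mod, n=0):
    """Distance in metres from phase phi (radians), per eq. 15.

    f_mod is the modulation frequency in Hz and n is the integer
    number of whole phase wraps.
    """
    return SPEED_OF_LIGHT * (phi + 2.0 * math.pi * n) / (4.0 * math.pi * f_mod)


def unambiguous_range(f_mod):
    """Distance covered by one full phase wrap: c / (2 * f_mod)."""
    return SPEED_OF_LIGHT / (2.0 * f_mod)
```

With an arbitrary example frequency of 20 MHz, a phase of π corresponds to c / (4·f_mod), roughly 3.75 m.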

In prior art, data binning is a technique used for reducing the noise of images. It's a data pre-processing technique used to reduce the effect of minor observation errors. The original data values which fall in a given small interval, a bin, are replaced by a value representative of that interval, often a central value.

In the context of image processing, binning is the procedure of combining different image data into one single image data. The binning can be temporal or spatial. For temporal binning, one single pixel acquires data at different moments in time and the acquired data are combined to form one single data representative of an interval of time. For spatial binning, the data acquired by a plurality of pixels, at one single moment in time, are combined to form one single data representative of a spatial interval. For instance, an array of 4 pixels becomes a single larger pixel, reducing the overall number of pixels. This aggregation, reducing the number of data, facilitates the analysis. Binning the data may also reduce the impact of read noise on the processed image.

In the context of Time-Of-Flight measurements, binning techniques have been implemented but very often, these methods are not accurate. The phase φ is often obtained by performing a series of measurements and averaging the measured phases, as explained in the following paragraphs.

US patent application no. U.S. 2014/049767 A1 is a prior art reference related to the present invention.

In order to obtain accurate measurement, it is important to ensure a correct measurement of parameters I and Q. A solution remains to be proposed in order to improve the accuracy of these measurements.

SUMMARY OF THE INVENTION

The present invention relates to a method for binning Time-Of-Flight data, the Time-Of-Flight data comprising phase data and confidence data, according to claim 1.

This method considerably reduces the noise of the Time-Of-Flight measurements. The binned phase φ_(f) obtained with the present invention is also more accurate.

Advantageously, the binning is either temporal, when the plurality of TOF data are acquired at different times, or spatial, when the plurality of TOF data are acquired by different photo-sensitive elements, or a combination of both spatial and temporal binning.

Preferably, the method further comprises the steps of predetermining a confidence target parameter and determining the number of TOF data to be acquired, such that the confidence of the binned vector reaches the predetermined confidence target parameter.

More preferably, when the binning is temporal, the method further comprises the steps of predetermining a movement threshold; detecting movement of the scene; and stopping adding TOF data if the detected movement of the scene is above the predetermined movement threshold. This movement threshold ensures that the measurement remains accurate.

More advantageously, when the binning is spatial, the method further comprises the steps of predetermining a depth threshold; detecting the depth of the scene; and stopping adding TOF data if the detected depth of the scene is above the predetermined depth threshold.

Other advantages and novel features of the invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention shall be better understood in light of the following description and the accompanying drawings.

FIG. 1 illustrates the basic operational principle of a TOF camera system;

FIG. 2 illustrates an example of signals used to determine correlation measurements in a TOF system;

FIG. 3 illustrates the polar form of a reflected signal S_(φ);

FIG. 4 illustrates a prior art method of phase binning;

FIG. 5 illustrates a vector addition according to an embodiment of the invention;

FIG. 6 is a graph comparing the noise obtained with the method of the present invention with the ones obtained with prior art methods.


DESCRIPTION OF THE INVENTION

In prior art, binning techniques have been implemented to reduce the noise of Time-Of-Flight measurements. One of these techniques is represented in FIG. 4. Each vector {right arrow over (S₁)}, {right arrow over (S₂)} and {right arrow over (S₃)} corresponds to a reflected modulated signal S₁, S₂ and S₃, with a phase φ₁, φ₂ and φ₃ and a norm (or confidence) r₁, r₂ and r₃, respectively. These 3 vectors can be obtained by 3 different pixels at the same time (for spatial binning) or by one single pixel at 3 different times (for temporal binning). For illustration purposes, only 3 vectors {right arrow over (S₁)}, {right arrow over (S₂)} and {right arrow over (S₃)} have been represented, but the method is commonly used with many more signals and corresponding vectors.

In prior art, the phase binning is performed in a very simple manner. The binned phase φ_(m) is simply the average of φ₁, φ₂ and φ₃, i.e.

$\phi_{m} = \frac{\phi_{1} + \phi_{2} + \phi_{3}}{3}.$

The distances of objects of a scene are then calculated from this averaged phase φ_(m), for instance by using equation 15.

Another equivalent method is to calculate 3 depths D₁, D₂ and D₃ from φ₁, φ₂ and φ₃ and then to average D₁, D₂ and D₃ to obtain the binned depth D_(m).

In the present invention, a more precise technique to calculate the binned phase is provided. This technique is represented in FIG. 5. Here again, the data of only 3 signals S₁, S₂ and S₃ are combined, or binned, for clarity purposes, but the invention is not limited thereto and can be implemented with any number of signals.

The first step of the method is to acquire a plurality of Time-Of-Flight data by illuminating a scene with a plurality of modulated signals. By Time-Of-Flight data, it is meant the phase and the norm, or confidence, of the signal reflected from a scene. This acquisition can be performed with a lot of different techniques known in prior art, such as correlation for instance.

Then, once the phase and confidence of the modulated signals are known, each Time-Of-Flight data is associated or represented by a vector defined by a phase and a confidence data, respectively. In FIG. 5, 3 vectors {right arrow over (S₁)}, {right arrow over (S₂)} and {right arrow over (S₃)} are represented.

Each vector {right arrow over (S₁)}, {right arrow over (S₂)} and {right arrow over (S₃)} corresponds to a modulated signal S₁, S₂ and S₃ , with a phase φ₁, φ₂ and φ₃ and a norm or confidence r₁, r₂ and r₃, respectively. These 3 vectors can be obtained by 3 different pixels at the same time (spatial binning) or by one single pixel at 3 different times (temporal binning).

Then, the method of the present invention consists in performing a vector addition of the 3 vectors for obtaining what can be called a “binned vector”, i.e. the vector obtained by adding, or binning, the vectors associated with the modulated signals. Each vector {right arrow over (S_(i))} can be associated with a complex exponential of the form $r_{i} e^{i\phi_{i}}$.

Once the vector addition has been performed, the binned vector {right arrow over (S_(f))} can be associated with a complex exponential of the form $r_{f} e^{i\phi_{f}}$, with $r_{f} e^{i\phi_{f}} = r_{1} e^{i\phi_{1}} + r_{2} e^{i\phi_{2}} + r_{3} e^{i\phi_{3}}$, and it is possible to determine the phase φ_(f) and the confidence r_(f) of this binned vector {right arrow over (S_(f))}.

The phase and confidence of the binned vector {right arrow over (S_(f))} are finally used for obtaining depth data of the scene. The phase φ_(f) can for instance be introduced in equation 15 for determining distance parameters.
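The steps described so far (associating each TOF data with a vector, adding the vectors, and reading back the phase and confidence of the sum) can be sketched in Python using complex numbers. This is only an illustrative sketch; the function name `bin_tof` and the representation of the TOF data as (phase, confidence) pairs are assumptions.

```python
import cmath
import math


def bin_tof(samples):
    """Bin TOF data by vector addition.

    samples: non-empty list of (phase_in_radians, confidence) pairs.
    Returns (binned_phase, binned_confidence), with the phase in [0, 2*pi).
    """
    if not samples:
        raise ValueError("at least one TOF sample is required")
    # Each sample becomes the complex number r * exp(i * phi).
    binned = sum(cmath.rect(r, phi) for phi, r in samples)
    # Phase and norm of the binned vector.
    phase_f = cmath.phase(binned) % (2.0 * math.pi)
    confidence_f = abs(binned)
    return phase_f, confidence_f
```

Two samples of equal confidence 1 with phases 0 and π/2, for instance, give a binned phase of π/4 and a binned confidence of √2.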

The binning can be either temporal or spatial, or a combination of both:

-   For temporal binning, the TOF data to be combined are acquired at different instants in time, i.e. in different frames;
-   For spatial binning, the TOF data to be combined are acquired at the same instant but by different photo-sensitive elements, e.g. different pixels of a TOF camera.

One of the advantages of the present invention is the following. In practical situations, when measuring the distance of an object from a Time-Of-Flight camera system, it is extremely rare to obtain a configuration where

$\phi_{f} = {\frac{\phi_{1} + \phi_{2} + \phi_{3}}{3}.}$

Hence, the binned phase φ_(f) obtained with the present invention is more accurate and reduces the noise of the measurement.
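To make the difference concrete, the sketch below compares the arithmetic mean of the phases with the phase of the binned vector for three samples. The sample values are arbitrary and made up for illustration, with one low-confidence sample far from the other two; the function names are hypothetical.

```python
import cmath
import math


def mean_phase(samples):
    """Prior-art style binning: plain average of the phases."""
    return sum(phi for phi, _ in samples) / len(samples)


def binned_phase(samples):
    """Phase of the vector sum of r * exp(i * phi)."""
    total = sum(cmath.rect(r, phi) for phi, r in samples)
    return cmath.phase(total) % (2.0 * math.pi)


# (phase in radians, confidence): two confident samples, one weak outlier.
samples = [(1.00, 10.0), (1.10, 9.0), (2.50, 1.0)]

phi_m = mean_phase(samples)    # every sample counts equally
phi_f = binned_phase(samples)  # samples count in proportion to confidence
```

In this example the plain average is pulled towards the weak outlier, whereas the phase of the binned vector stays close to the two confident samples, because each sample contributes to the sum in proportion to its confidence.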

In FIG. 6, a graph comparing the noise obtained with prior art methods with the one obtained with the present invention is presented.

The graph shows the amount of noise on the Y axis as a function of the confidence (or norm) on the X axis. The data represented by a circle show the raw depth values, i.e. without binning. The data represented by a cross correspond to depth values for which a binning method has been applied in the depth domain, i.e. by a prior art method. Finally, the data represented by a thicker and darker cross correspond to depth values for which the IQ binning method of the present invention has been applied.

This graph demonstrates that the higher the confidence, the lower the noise on the raw data. The present invention reduces the noise over the entire confidence range by a factor of 2, whereas binning in the depth domain, i.e. a prior art method, increases the noise at low confidence.

Further steps may be added to the method. In one embodiment, the method may further comprise the steps of:

-   predetermining a confidence target parameter, for instance a threshold value;
-   determining the number of TOF data to be acquired, such that the confidence of the binned vector reaches the predetermined confidence target parameter.
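One possible way to realise these two steps, given purely as a sketch, is to keep adding vectors until the norm of the running sum reaches the target. The function name `bin_until_confident` and the (phase, confidence) input format are assumptions.

```python
import cmath
import math


def bin_until_confident(samples, confidence_target):
    """Add TOF vectors one by one until the binned confidence reaches the target.

    samples: iterable of (phase_in_radians, confidence) pairs, in acquisition order.
    Returns (binned_phase, binned_confidence, number_of_samples_used).
    If the target is never reached, all samples are used.
    """
    total = 0j
    used = 0
    for phi, r in samples:
        total += cmath.rect(r, phi)
        used += 1
        if abs(total) >= confidence_target:
            break  # enough TOF data have been acquired
    return cmath.phase(total) % (2.0 * math.pi), abs(total), used
```

With ten identical samples of confidence 1 and a target of 2.5, the loop stops after the third sample.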

For temporal binning, the method may further comprise the steps of:

-   predetermining a movement threshold;
-   detecting movement of the scene with respect to the movement threshold;
-   stopping adding Time-Of-Flight data over time if the detected movement of the scene is above the predetermined movement threshold.

The movement detection can be performed by several methods known in prior art. For instance, if the predetermined movement threshold is 25 cm and a movement of 50 cm is detected, the adding and averaging of data is stopped and a new series of acquisitions starts. In this way the resulting video stream secures motion robustness while temporally filtering the non-moving parts of the scene. Other ways of detecting movement can rely on changes in confidence and/or other sensors present, for example RGB sensors or accelerometers.
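The temporal steps above can be sketched for one pixel as follows. The choice of movement detector is left open above; this sketch assumes, purely for illustration, that movement is estimated as the change in depth between the incoming frame and the current binned vector, with n = 0 in equation 15 and phase wrap-around ignored. All names are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def depth_of(vector, f_mod):
    """Depth in metres of a complex TOF vector, per eq. 15 with n = 0."""
    phi = cmath.phase(vector) % (2.0 * math.pi)
    return C * phi / (4.0 * math.pi * f_mod)


def temporal_bin(frames, f_mod, movement_threshold_m):
    """Temporally bin one pixel's frames, restarting when movement is detected.

    frames: list of (phase_in_radians, confidence) pairs, one per frame.
    Returns one (binned_phase, binned_confidence) pair per input frame.
    """
    acc = 0j
    out = []
    for phi, r in frames:
        new = cmath.rect(r, phi)
        # Assumed detector: depth jump between the new frame and the running bin.
        moved = acc != 0 and abs(depth_of(new, f_mod) - depth_of(acc, f_mod)) > movement_threshold_m
        if moved:
            acc = 0j  # stop adding to the old series and start a new one
        acc += new
        out.append((cmath.phase(acc) % (2.0 * math.pi), abs(acc)))
    return out
```

With a threshold of 0.25 m, three frames at 1.0 m followed by frames at 1.5 m cause the accumulator to restart at the fourth frame, so the binned confidence drops back to that of a single frame before growing again.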

For spatial binning, the method may further comprise the steps of:

-   predetermining a depth threshold;
-   detecting the depth of the scene;
-   stopping adding Time-Of-Flight data if the detected depth of the scene is above the predetermined depth threshold.

This depth threshold criterion can be implemented in various ways. It can be a simple comparison with the threshold, but can also be an iterative process to identify intelligent binning zones (cf. superpixels). The end goal is to preserve edges present in the scene and to bin together the Time-Of-Flight data of the parts within one zone at a similar distance.
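The simple-comparison variant can be sketched as follows. This sketch interprets the criterion, as an assumption, as a limit on the depth difference between a centre pixel and each neighbour (with n = 0 in equation 15), so that neighbours lying across an edge are left out of the bin. All names are hypothetical.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def depth_from_phase(phi, f_mod):
    """Depth in metres per eq. 15 with n = 0."""
    return C * phi / (4.0 * math.pi * f_mod)


def spatial_bin(centre, neighbours, f_mod, depth_threshold_m):
    """Bin a centre pixel with the neighbours that lie at a similar depth.

    centre and each neighbour are (phase_in_radians, confidence) pairs
    acquired at the same instant by different pixels.
    Returns (binned_phase, binned_confidence, number_of_pixels_used).
    """
    c_phi, c_r = centre
    acc = cmath.rect(c_r, c_phi)
    centre_depth = depth_from_phase(c_phi, f_mod)
    used = 1
    for phi, r in neighbours:
        # Assumed criterion: skip pixels whose depth differs too much from the centre.
        if abs(depth_from_phase(phi, f_mod) - centre_depth) <= depth_threshold_m:
            acc += cmath.rect(r, phi)
            used += 1
    return cmath.phase(acc) % (2.0 * math.pi), abs(acc), used
```

A superpixel-style implementation would replace the fixed neighbourhood with iteratively grown zones; that variant is not sketched here.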

1. A method for binning Time-Of-Flight (TOF) data from a scene, for increasing the accuracy of TOF measurements and reducing the noise therein, the TOF data comprising phase data and confidence data, the method comprising the steps of: acquiring a plurality of TOF data by illuminating the scene with a plurality of modulated signals (S₁, S₂ and S₃); the method being characterized by the further steps of: associating each modulated signal with a vector ({right arrow over (S₁)}, {right arrow over (S₂)} and {right arrow over (S₃)}) defined by a phase (φ₁, φ₂ and φ₃) and a confidence data (r₁, r₂ and r₃), respectively; adding the plurality of vectors for obtaining a binned vector ({right arrow over (S)}_(f)); determining the phase (φ_(f)) and confidence (r_(f)) of the binned vector ({right arrow over (S)}_(f)); processing the phase (φ_(f)) and confidence data (r_(f)) of the binned vector ({right arrow over (S)}_(f)) for obtaining depth data (D) of the scene.
 2. The method of claim 1, wherein the plurality of TOF data are acquired at different times for performing temporal binning.
 3. The method of claim 1, wherein the plurality of TOF data are acquired by different photo-sensitive elements for performing spatial binning.
 4. The method of claim 1, wherein the binning is a combination of both temporal and spatial binning.
 5. The method of claim 1, further comprising the steps of: predetermining a confidence target parameter; determining the number of TOF data to be acquired, such that the confidence of the binned vector ({right arrow over (S)}_(f)) reaches the predetermined confidence target parameter.
 6. The method of claim 2, further comprising the steps of: predetermining a movement threshold; detecting movement of the scene; stopping adding TOF data if the detected movement of the scene is above the predetermined movement threshold.
 7. The method of claim 3, further comprising the steps of: predetermining a depth threshold; detecting the depth of the scene; stopping adding TOF data if the detected depth of the scene is above the predetermined depth threshold. 